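In this work on the equivariance of linear neural network layers, we relax the equivariance condition so that it only needs to hold in a projective sense. In particular, we study the relation between projective and ordinary equivariance and show that, for important examples, the two problems are in fact equivalent. The rotation group in 3D acts projectively on the projective plane. We experimentally study the practical importance of rotation equivariance when designing networks for filtering 2D-2D correspondences. Fully equivariant models perform poorly, and while simply adding invariant features to a strong baseline yields improvements, this does not seem to be due to improved equivariance.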
translated by Google Translate
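The aim of this paper is to demonstrate that a matching network can be made more robust to rotations by simply replacing its backbone CNN with a steerable CNN, which is equivariant to translations and image rotations. Experiments show that this boost is obtained without reducing performance on ordinary illumination and viewpoint matching sequences.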
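In this paper, we are concerned with rotation equivariance on 2D point cloud data. We describe a particular set of functions able to approximate any continuous rotation-equivariant and permutation-invariant function. Based on this result, we propose a novel neural network architecture for processing 2D point clouds, and we prove its universality for approximating functions that exhibit these symmetries. We also show how to extend the architecture to accept a set of 2D-2D correspondences as input while maintaining similar equivariance properties. Experiments are presented on the estimation of essential matrices in stereo vision.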
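To make the two symmetries concrete, here is a minimal numerical check in plain Python. It is not the architecture from the abstract: the centroid is used only as a simple stand-in for a function that is rotation equivariant and permutation invariant, and the helper names (`rotate`, `centroid`, `close`) are chosen for illustration.

```python
import math
import random

def rotate(p, theta):
    """Rotate a 2D point counter-clockwise by theta radians."""
    x, y = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

def centroid(points):
    """Map a point cloud to a single 2D vector (illustrative stand-in function)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def close(a, b, tol=1e-9):
    """Compare two 2D vectors up to floating-point tolerance."""
    return all(abs(u - v) <= tol for u, v in zip(a, b))

rng = random.Random(0)
cloud = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(50)]
theta = 0.7

# Rotation equivariance: f(R x_1, ..., R x_n) == R f(x_1, ..., x_n)
rotated_cloud = [rotate(p, theta) for p in cloud]
is_equivariant = close(centroid(rotated_cloud), rotate(centroid(cloud), theta))

# Permutation invariance: reordering the points does not change the output
shuffled = cloud[:]
rng.shuffle(shuffled)
is_perm_invariant = close(centroid(shuffled), centroid(cloud))
```

The same two checks could be applied to any candidate model that maps a 2D point cloud to a 2D vector.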
Objective: Imbalances of the electrolyte concentration levels in the body can lead to catastrophic consequences, but accurate and accessible measurements could improve patient outcomes. While blood tests provide accurate measurements, they are invasive and the laboratory analysis can be slow or inaccessible. In contrast, an electrocardiogram (ECG) is a widely adopted tool which is quick and simple to acquire. However, the problem of estimating continuous electrolyte concentrations directly from ECGs is not well-studied. We therefore investigate if regression methods can be used for accurate ECG-based prediction of electrolyte concentrations. Methods: We explore the use of deep neural networks (DNNs) for this task. We analyze the regression performance across four electrolytes, utilizing a novel dataset containing over 290000 ECGs. For improved understanding, we also study the full spectrum from continuous predictions to binary classification of extreme concentration levels. To enhance clinical usefulness, we finally extend to a probabilistic regression approach and evaluate different uncertainty estimates. Results: We find that the performance varies significantly between different electrolytes, which is clinically justified in the interplay of electrolytes and their manifestation in the ECG. We also compare the regression accuracy with that of traditional machine learning models, demonstrating superior performance of DNNs. Conclusion: Discretization can lead to good classification performance, but does not help solve the original problem of predicting continuous concentration levels. While probabilistic regression demonstrates potential practical usefulness, the uncertainty estimates are not particularly well-calibrated. Significance: Our study is a first step towards accurate and reliable ECG-based prediction of electrolyte concentration levels.
Active learning as a paradigm in deep learning is especially important in applications involving intricate perception tasks such as object detection where labels are difficult and expensive to acquire. Development of active learning methods in such fields is highly computationally expensive and time consuming which obstructs the progression of research and leads to a lack of comparability between methods. In this work, we propose and investigate a sandbox setup for rapid development and transparent evaluation of active learning in deep object detection. Our experiments with commonly used configurations of datasets and detection architectures found in the literature show that results obtained in our sandbox environment are representative of results on standard configurations. The total compute time to obtain results and assess the learning behavior can thereby be reduced by factors of up to 14 when comparing with Pascal VOC and up to 32 when comparing with BDD100k. This allows for testing and evaluating data acquisition and labeling strategies in under half a day and contributes to the transparency and development speed in the field of active learning for object detection.
Workplace injuries are common in today's society due to a lack of adequately worn safety equipment. A system that only admits appropriately equipped personnel can be created to improve working conditions. The goal is thus to develop a system that will improve workers' safety using a camera that will detect the usage of Personal Protective Equipment (PPE). To this end, we collected and labeled appropriate data from several public sources, which have been used to train and evaluate several models based on the popular YOLOv4 object detector. Our focus, driven by a collaborating industrial partner, is to implement our system into an entry control point where workers must present themselves to obtain access to a restricted area. Combined with facial identity recognition, the system would ensure that only authorized people wearing appropriate equipment are granted access. A novelty of this work is that we increase the number of classes to five objects (hardhat, safety vest, safety gloves, safety glasses, and hearing protection), whereas most existing works only focus on one or two classes, usually hardhats or vests. The AI model developed provides good detection accuracy at a distance of 3 and 5 meters in the collaborative environment where we aim at operating (mAP of 99/89%, respectively). The small size of some objects or the potential occlusion by body parts have been identified as potential factors that are detrimental to accuracy, which we have counteracted via data augmentation and cropping of the body before applying PPE detection.
Current state-of-the-art deep neural networks for image classification are made up of 10 to 100 million learnable weights and are therefore inherently prone to overfitting. The weight count can be seen as a function of the number of channels, the spatial extent of the input, and the number of layers of the network. Due to the use of convolutional layers, the weight count usually scales linearly with respect to the resolution dimensions, but remains quadratic with respect to the number of channels. Recent research on multigrid-inspired ideas in deep neural networks has shown that, on the one hand, a significant number of weights can be saved by appropriate weight sharing and, on the other, a hierarchical structure in the channel dimension can reduce the weight complexity to linear. In this work, we combine these multigrid ideas to introduce a joint framework of multigrid-inspired architectures that exploits multigrid structures in all relevant dimensions to achieve linear weight complexity scaling and drastically reduced weight counts. Our experiments show that this structured reduction in weight count reduces overfitting and thus improves performance over state-of-the-art ResNet architectures on typical image classification benchmarks at lower network complexity.
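The quadratic dependence on the number of channels can be seen from the weight count of a single standard 2D convolution. The sketch below is only that arithmetic, with biases ignored; it is not the multigrid architecture from the abstract, and the function name `conv_weight_count` is hypothetical.

```python
def conv_weight_count(c_in, c_out, kernel_size=3):
    """Weights in a standard 2D convolution (bias ignored): k * k * C_in * C_out."""
    return kernel_size * kernel_size * c_in * c_out

# Doubling the channel width of a layer multiplies its weight count by four.
narrow = conv_weight_count(64, 64)    # 3 * 3 * 64 * 64
wide = conv_weight_count(128, 128)    # 3 * 3 * 128 * 128
ratio = wide // narrow
```

Because the count depends on the product `c_in * c_out`, widening every layer by a factor `w` grows the weights by roughly `w ** 2`, which is the scaling the abstract aims to bring down to linear.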
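Computer-aided detection systems based on deep learning have shown good performance in breast cancer detection. However, high-density breasts show poorer detection performance, since dense tissue can mask or even simulate masses. As a result, the sensitivity of breast cancer detection can be reduced by more than 20% in dense breasts. In addition, extremely dense cases have been reported to carry an increased risk of cancer compared with low-density breasts. This study aims to improve mass detection performance in high-density breasts by using synthetic high-density full-field digital mammograms (FFDM) as data augmentation during the training of breast mass detection models. To this end, five cycle-consistent GAN (CycleGAN) models were trained on three FFDM datasets for low-to-high-density image translation in high-resolution mammograms. The training images were split by BI-RADS breast density category, with BI-RADS A being almost entirely fatty and BI-RADS D being extremely dense breasts. Our results show that the proposed data augmentation technique improves the sensitivity and precision of mass detection in high-density breasts on two different test sets and is useful as a domain adaptation technique. In addition, the clinical realism of the synthetic images was evaluated in a reader study involving two expert radiologists and one surgical oncologist.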
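In domains where sample sizes are limited, efficient learning algorithms are critical. Learning using privileged information (LuPI) offers increased sample efficiency by giving prediction models access, at training time, to types of information that are unavailable when the models are used. Recent work showed that, for prediction in linear-Gaussian dynamical systems, a LuPI learner with access to intermediate time-series data is never worse, and often better, than any unbiased classical learner. We provide new insights into this analysis and generalize it to nonlinear prediction tasks in latent dynamical systems, extending the theoretical guarantees to the case where the map connecting latent variables and observations is known up to a linear transform. In addition, we propose algorithms based on random features and representation learning for the case where this map is unknown. A suite of empirical results confirms the theoretical findings and shows the potential of using privileged time-series information in nonlinear prediction.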
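Vision transformers have demonstrated the potential to outperform CNNs in a variety of vision tasks. However, the computational and memory requirements of these models prohibit their use in many applications, especially those that depend on high-resolution images, such as medical image classification. Efforts to train ViTs more efficiently are overly complicated, requiring architectural changes or intricate training schemes. In this work, we show that standard ViT models can be trained efficiently at high resolution by randomly dropping input image patches. This simple approach, PatchDropout, reduces FLOPs and memory by at least 50% on standard natural image datasets such as ImageNet, and these savings only increase with image size. On CSAW, a high-resolution medical dataset, we observe a 5x saving in computation and memory using PatchDropout, along with a boost in performance. For practitioners with a fixed computational or memory budget, PatchDropout makes it possible to choose image resolution, hyperparameters, or model size to get the most performance out of their model.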
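The core idea of randomly dropping input patches can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the authors' implementation: the function name `patch_dropout`, the `keep_ratio` parameter, and the 224x224 image with 16x16 patches are all assumptions made for the example.

```python
import random

def patch_dropout(patch_tokens, keep_ratio=0.5, rng=None):
    """Keep a uniformly random subset of patch tokens, preserving their order.

    patch_tokens: list of per-patch items (for example, embedded patches).
    keep_ratio:   fraction of patches to keep (hypothetical parameter name).
    Returns the kept tokens and their original indices.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    rng = rng or random.Random()
    n = len(patch_tokens)
    n_keep = min(n, max(1, int(n * keep_ratio)))
    # Sample without replacement, then sort so the sequence order is preserved.
    kept_idx = sorted(rng.sample(range(n), n_keep))
    return [patch_tokens[i] for i in kept_idx], kept_idx

# Assumed setup: a 224x224 image split into 16x16 patches gives 14 * 14 = 196 tokens.
tokens = list(range(196))
kept, idx = patch_dropout(tokens, keep_ratio=0.5, rng=random.Random(0))
```

With `keep_ratio=0.5` the transformer would process half as many tokens per image, which is where the reduction in computation and memory would come from.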